19 research outputs found

    Transcription factor family‐specific DNA shape readout revealed by quantitative specificity models

    Transcription factors (TFs) achieve DNA-binding specificity through contacts with functional groups of bases (base readout) and readout of structural properties of the double helix (shape readout). Currently, it remains unclear whether DNA shape readout is utilized by only a few selected TF families, or whether this mechanism is used extensively by most TF families. We resequenced data from previously published HT-SELEX experiments, the most extensive mammalian TF–DNA binding data available to date. Using these data, we demonstrated the contributions of DNA shape readout across diverse TF families and its importance in core motif-flanking regions. Statistical machine-learning models combined with feature-selection techniques helped to reveal the nucleotide position-dependent DNA shape readout in TF-binding sites and the TF family-specific position dependence. Based on these results, we proposed novel DNA shape logos to visualize the DNA shape preferences of TFs. Overall, this work suggests a way of obtaining mechanistic insights into TF–DNA binding without relying on experimentally solved all-atom structures.

    Modular discovery of monomeric and dimeric transcription factor binding motifs for large data sets

    In some dimeric cases of transcription factor (TF) binding, the specificity of dimeric motifs has been observed to differ notably from what would be expected were the two factors to bind to DNA independently of each other. Current motif discovery methods are unable to learn monomeric and dimeric motifs in a modular fashion such that deviations from the expected motif would become explicit and the noise from dimeric occurrences would not corrupt monomeric models. We propose a novel modeling technique and an expectation maximization algorithm, implemented as the software tool MODER, for discovering monomeric TF binding motifs and their dimeric combinations. Given training data and seeds for monomeric motifs, the algorithm learns, in the same probabilistic framework, a mixture model which represents monomeric motifs as standard position-specific probability matrices (PPMs), and dimeric motifs as pairs of monomeric PPMs with associated orientation and spacing preferences. For dimers, the model represents deviations from a pure modular model of two independent monomers, thus making co-operative binding effects explicit. MODER can analyze tens of Mbps of training data in reasonable time. We validated the tool on HT-SELEX and ChIP-seq data. Our findings include some TFs whose expected model has palindromic symmetry but whose observed model is directional. Peer reviewed
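As a rough illustration of the "two independent monomers" expectation described in this abstract, the sketch below builds an expected dimer PPM by joining two monomer PPMs with a fixed spacing and compares an observed dimer PPM against it column by column. This is not MODER's implementation: the function names, the uniform background, the toy matrices, and the restriction to non-overlapping dimers in a single orientation are all assumptions made for the example.

```python
from math import prod

BASES = "ACGT"

def site_probability(ppm, site):
    """Probability of `site` under a PPM given as a list of per-position dicts."""
    if len(site) != len(ppm):
        raise ValueError("site and PPM must have the same length")
    return prod(column[base] for column, base in zip(ppm, site))

def expected_dimer_ppm(left, right, gap, background=None):
    """Join two monomer PPMs with `gap` background columns in between.

    Assumption: the monomers bind independently and do not overlap,
    so the expected dimer is a plain concatenation.
    """
    if gap < 0:
        raise ValueError("overlapping dimers are not handled in this sketch")
    if background is None:
        background = {b: 0.25 for b in BASES}  # assumed uniform background
    spacer = [dict(background) for _ in range(gap)]
    return list(left) + spacer + list(right)

def column_deviation(observed, expected):
    """Per-position, per-base difference between observed and expected PPMs."""
    return [{b: obs[b] - exp[b] for b in BASES}
            for obs, exp in zip(observed, expected)]

# Hypothetical two-column monomer that prefers "AG".
toy = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
]
dimer = expected_dimer_ppm(toy, toy, gap=1)
p = site_probability(dimer, "AGCAG")
```

Nonzero entries returned by `column_deviation` for a fitted dimer model would mark positions where binding departs from the independent expectation, which is the kind of co-operative effect the abstract says the model makes explicit.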

    Structural insights into the DNA-binding specificity of E2F family transcription factors

    The mammalian cell cycle is controlled by the E2F family of transcription factors. Typical E2Fs bind to DNA as heterodimers with the related dimerization partner (DP) proteins, whereas the atypical E2Fs, E2F7 and E2F8, contain two DNA-binding domains (DBDs) and act as repressors. To understand the mechanism of repression, we have resolved the structure of E2F8 in complex with DNA at atomic resolution. We find that the first and second DBDs of E2F8 resemble the DBDs of typical E2F and DP proteins, respectively. Using molecular dynamics simulations, biochemical affinity measurements and chromatin immunoprecipitation, we further show that both atypical and typical E2Fs bind to similar DNA sequences in vitro and in vivo. Our results represent the first crystal structure of an E2F protein with two DBDs, and reveal the mechanism by which atypical E2Fs can repress canonical E2F target genes and exert their negative influence on cell cycle progression. Peer reviewed

    Binding specificities of human RNA-binding proteins toward structured and linear RNA sequences

    RNA-binding proteins (RBPs) regulate RNA metabolism at multiple levels by affecting splicing of nascent transcripts, RNA folding, base modification, transport, localization, translation, and stability. Despite their central role in RNA function, the RNA-binding specificities of most RBPs remain unknown or incompletely defined. To address this, we have assembled a genome-scale collection of RBPs and their RNA-binding domains (RBDs) and assessed their specificities using high-throughput RNA-SELEX (HTR-SELEX). Approximately 70% of RBPs for which we obtained a motif bound to short linear sequences, whereas approximately 30% preferred structured motifs folding into stem-loops. We also found that many RBPs can bind to multiple distinctly different motifs. Analysis of the matches of the motifs in human genomic sequences suggested novel roles for many RBPs. We found that three cytoplasmic proteins (ZC3H12A, ZC3H12B, and ZC3H12C) bound to motifs resembling the splice donor sequence, suggesting that these proteins are involved in degradation of cytoplasmic viral and/or unspliced transcripts. Structural analysis revealed that the RNA motif was not bound by the conventional C3H1 RNA-binding domain of ZC3H12B. Instead, the RNA motif was bound by the PilT N terminus (PIN) RNase domain of ZC3H12B, revealing a potential mechanism by which unconventional RBDs containing active sites or molecule-binding pockets could interact with short, structured RNA molecules. Our collection containing 145 high-resolution binding specificity models for 86 RBPs is the largest systematic resource for the analysis of human RBPs and will greatly facilitate future analysis of the various biological roles of this important class of proteins. Peer reviewed

    Myt1l safeguards neuronal identity by actively repressing many non-neuronal fates

    Normal differentiation and induced reprogramming require the activation of target cell programs and silencing of donor cell programs(1,2). In reprogramming, the same factors are often used to reprogram many different donor cell types(3). As most developmental repressors, such as RE1-silencing transcription factor (REST) and Groucho (also known as TLE), are considered lineage-specific repressors(4,5), it remains unclear how identical combinations of transcription factors can silence so many different donor programs. Distinct lineage repressors would have to be induced in different donor cell types. Here, by studying the reprogramming of mouse fibroblasts to neurons, we found that the pan neuron-specific transcription factor Myt1-like (Myt1l)(6) exerts its pro-neuronal function by direct repression of many different somatic lineage programs except the neuronal program. The repressive function of Myt1l is mediated via recruitment of a complex containing Sin3b by binding to a previously uncharacterized N-terminal domain. In agreement with its repressive function, the genomic binding sites of Myt1l are similar in neurons and fibroblasts and are preferentially in an open chromatin configuration. The Notch signalling pathway is repressed by Myt1l through silencing of several members, including Hes1. Acute knockdown of Myt1l in the developing mouse brain mimicked a Notch gain-of-function phenotype, suggesting that Myt1l allows newborn neurons to escape Notch activation during normal development. Depletion of Myt1l in primary postmitotic neurons de-repressed non-neuronal programs and impaired neuronal gene expression and function, indicating that many somatic lineage programs are actively and persistently repressed by Myt1l to maintain neuronal identity. It is now tempting to speculate that similar 'many-but-one' lineage repressors exist for other cell fates; such repressors, in combination with lineage-specific activators, would be prime candidates for use in reprogramming additional cell types. Non peer reviewed